Goto

Collaborating Authors

 number density


Initial Conditions from Galaxies: Machine-Learning Subgrid Correction to Standard Reconstruction

Parker, Liam, Bayer, Adrian E., Seljak, Uros

arXiv.org Artificial Intelligence

We present a hybrid method for reconstructing the primordial density from late-time halos and galaxies. Our approach involves two steps: (1) apply standard Baryon Acoustic Oscillation (BAO) reconstruction to recover the large-scale features in the primordial density field and (2) train a deep learning model to learn small-scale corrections on partitioned subgrids of the full volume. At inference, this correction is then convolved across the full survey volume, enabling scaling to large survey volumes. We train our method on both mock halo catalogs and mock galaxy catalogs in both configuration and redshift space from the Quijote 1(h 1 Gpc) 3 simulation suite. When evaluated on held-out simulations, our combined approach significantly improves the reconstruction cross-correlation coefficient with the true initial density field and remains robust to moderate model misspecification. Additionally, we show that models trained on 1(h 1 Gpc) 3 can be applied to larger boxes--e.g., (3h 1 Gpc) 3 --without retraining. Finally, we perform a Fisher analysis on our method's recovery of the BAO peak, and find that it significantly improves the error on the acoustic scale relative to standard BAO reconstruction. Ultimately, this method robustly captures nonlinearities and bias without sacrificing large-scale accuracy, and its flexibility to handle arbitrarily large volumes without escalating computational requirements makes it especially promising for large-volume surveys like DESI.


Using remotely sensed data for air pollution assessment

Bernardino, Teresa, Oliveira, Maria Alexandra, Silva, João Nuno

arXiv.org Artificial Intelligence

Air pollution constitutes a global problem of paramount importance that affects not only human health, but also the environment. The existence of spatial and temporal data regarding the concentrations of pollutants is crucial for performing air pollution studies and monitor emissions. However, although observation data presents great temporal coverage, the number of stations is very limited and they are usually built in more populated areas. The main objective of this work is to create models capable of inferring pollutant concentrations in locations where no observation data exists. A machine learning model, more specifically the random forest model, was developed for predicting concentrations in the Iberian Peninsula in 2019 for five selected pollutants: $NO_2$, $O_3$ $SO_2$, $PM10$, and $PM2.5$. Model features include satellite measurements, meteorological variables, land use classification, temporal variables (month, day of year), and spatial variables (latitude, longitude, altitude). The models were evaluated using various methods, including station 10-fold cross-validation, in which in each fold observations from 10\% of the stations are used as testing data and the rest as training data. The $R^2$, RMSE and mean bias were determined for each model. The $NO_2$ and $O_3$ models presented good values of $R^2$, 0.5524 and 0.7462, respectively. However, the $SO_2$, $PM10$, and $PM2.5$ models performed very poorly in this regard, with $R^2$ values of -0.0231, 0.3722, and 0.3303, respectively. All models slightly overestimated the ground concentrations, except the $O_3$ model. All models presented acceptable cross-validation RMSE, except the $O_3$ and $PM10$ models where the mean value was a little higher (12.5934 $\mu g/m^3$ and 10.4737 $\mu g/m^3$, respectively).


Application of Neural Networks for the Reconstruction of Supernova Neutrino Energy Spectra Following Fast Neutrino Flavor Conversions

Abbar, Sajad, Wu, Meng-Ru, Xiong, Zewei

arXiv.org Artificial Intelligence

Neutrinos can undergo fast flavor conversions (FFCs) within extremely dense astrophysical environments such as core-collapse supernovae (CCSNe) and neutron star mergers (NSMs). In this study, we explore FFCs in a \emph{multi-energy} neutrino gas, revealing that when the FFC growth rate significantly exceeds that of the vacuum Hamiltonian, all neutrinos (regardless of energy) share a common survival probability dictated by the energy-integrated neutrino spectrum. We then employ physics-informed neural networks (PINNs) to predict the asymptotic outcomes of FFCs within such a multi-energy neutrino gas. These predictions are based on the first two moments of neutrino angular distributions for each energy bin, typically available in state-of-the-art CCSN and NSM simulations. Our PINNs achieve errors as low as $\lesssim6\%$ and $\lesssim 18\%$ for predicting the number of neutrinos in the electron channel and the relative absolute error in the neutrino moments, respectively.


Predicting the Radiation Field of Molecular Clouds using Denoising Diffusion Probabilistic Models

Xu, Duo, Offner, Stella, Gutermuth, Robert, Grudic, Michael, Guszejnov, David, Hopkins, Philip

arXiv.org Artificial Intelligence

Accurately quantifying the impact of radiation feedback in star formation is challenging. To address this complex problem, we employ deep learning techniques, denoising diffusion probabilistic models (DDPMs), to predict the interstellar radiation field (ISRF) strength based on three-band dust emission at 4.5 \um, 24 \um, and 250 \um. We adopt magnetohydrodynamic simulations from the STARFORGE (STAR FORmation in Gaseous Environments) project that model star formation and giant molecular cloud (GMC) evolution. We generate synthetic dust emission maps matching observed spectral energy distributions in the Monoceros R2 (MonR2) GMC. We train DDPMs to estimate the ISRF using synthetic three-band dust emission. The dispersion between the predictions and true values is within a factor of 0.1 for the test set. We extended our assessment of the diffusion model to include new simulations with varying physical parameters. While there is a consistent offset observed in these out-of-distribution simulations, the model effectively constrains the relative intensity to within a factor of 2. Meanwhile, our analysis reveals weak correlation between the ISRF solely derived from dust temperature and the actual ISRF. We apply our trained model to predict the ISRF in MonR2, revealing a correspondence between intense ISRF, bright sources, and high dust emission, confirming the model's ability to capture ISRF variations. Our model robustly predicts radiation feedback distribution, even in complex, poorly constrained ISRF environments like those influenced by nearby star clusters. However, precise ISRF predictions require an accurate training dataset mirroring the target molecular cloud's unique physical conditions.


Machine learning for classifying and interpreting coherent X-ray speckle patterns

Shen, Mingren, Sheyfer, Dina, Loeffler, Troy David, Sankaranarayanan, Subramanian K. R. S., Stephenson, G. Brian, Chan, Maria K. Y., Morgan, Dane

arXiv.org Artificial Intelligence

Speckle patterns produced by coherent X-ray have a close relationship with the internal structure of materials but quantitative inversion of the relationship to determine structure from speckle patterns is challenging. Here, we investigate the link between coherent X-ray speckle patterns and sample structures using a model 2D disk system and explore the ability of machine learning to learn aspects of the relationship. Specifically, we train a deep neural network to classify the coherent X-ray speckle patterns according to the disk number density in the corresponding structure. It is demonstrated that the classification system is accurate for both non-disperse and disperse size distributions.


Denoising Diffusion Probabilistic Models to Predict the Density of Molecular Clouds

Xu, Duo, Tan, Jonathan C., Hsu, Chia-Jung, Zhu, Ye

arXiv.org Artificial Intelligence

ABSTRACT We introduce the state-of-the-art deep learning Denoising Diffusion Probabilistic Model (DDPM) as a method to infer the volume or number density of giant molecular clouds (GMCs) from projected mass surface density maps. We adopt magnetohydrodynamic simulations with different global magnetic field strengths and large-scale dynamics, i.e., noncolliding and colliding GMCs. We train a diffusion model on both mass surface density maps and their corresponding mass-weighted number density maps from different viewing angles for all the simulations. We compare the diffusion model performance with a more traditional empirical two-component and three-component power-law fitting method and with a more traditional neural network machine learning approach (casi-2d). We conclude that the diffusion model achieves an order of magnitude improvement on the accuracy of predicting number density compared to that by other methods. We apply the diffusion method to some example astronomical column density maps of Taurus and the Infrared Dark Clouds (IRDCs) G28.37+0.07 and G35.39-0.33 to produce maps of their mean volume densities. INTRODUCTION star clusters (e.g., McKee & Ostriker 2007; Heyer & Dame 2015; Krumholz et al. 2019), and the formation of Giant molecular clouds (GMCs) are one of the most complex organic molecules (e.g., Herbst & van Dishoeck important components of the interstellar medium (ISM) 2009; Jørgensen et al. 2020).


Working with Winding Numbers part2(Algebraic topology)

#artificialintelligence

Abstract: Topological invariance is a powerful concept in different branches of physics as they are particularly robust under perturbations. We generalize the ideas of computing the statistics of winding numbers for a specific parametric model of the chiral Gaussian Unitary Ensemble to other chiral random matrix ensembles. Especially, we address the two chiral symmetry classes, unitary (AIII) and symplectic (CII), and we analytically compute ensemble averages for ratios of determinants with parametric dependence. Abstract: The winding number is a concept in complex analysis which has, in the presence of chiral symmetry, a physics interpretation as the topological index belonging to gapped phases of fermions. We study statistical properties of this topological quantity.


Homodyned K-distribution: parameter estimation and uncertainty quantification using Bayesian neural networks

Tehrani, Ali K. Z., Rosado-Mendez, Ivan M., Rivaz, Hassan

arXiv.org Artificial Intelligence

Quantitative ultrasound (QUS) allows estimating the intrinsic tissue properties. Speckle statistics are the QUS parameters that describe the first order statistics of ultrasound (US) envelope data. The parameters of Homodyned K-distribution (HK-distribution) are the speckle statistics that can model the envelope data in diverse scattering conditions. However, they require a large amount of data to be estimated reliably. Consequently, finding out the intrinsic uncertainty of the estimated parameters can help us to have a better understanding of the estimated parameters. In this paper, we propose a Bayesian Neural Network (BNN) to estimate the parameters of HK-distribution and quantify the uncertainty of the estimator.


Target-Independent Active Learning via Distribution-Splitting

Cao, Xiaofeng, Tsang, Ivor W., Xu, Xiaofeng, Xu, Guandong

arXiv.org Machine Learning

To reduce the label complexity in Agnostic Active Learning (A^2 algorithm), volume-splitting splits the hypothesis edges to reduce the Vapnik-Chervonenkis (VC) dimension in version space. However, the effectiveness of volume-splitting critically depends on the initial hypothesis and this problem is also known as target-dependent label complexity gap. This paper attempts to minimize this gap by introducing a novel notion of number density which provides a more natural and direct way to describe the hypothesis distribution than volume. By discovering the connections between hypothesis and input distribution, we map the volume of version space into the number density and propose a target-independent distribution-splitting strategy with the following advantages: 1) provide theoretical guarantees on reducing label complexity and error rate as volume-splitting; 2) break the curse of initial hypothesis; 3) provide model guidance for a target-independent AL algorithm in real AL tasks. With these guarantees, for AL application, we then split the input distribution into more near-optimal spheres and develop an application algorithm called Distribution-based A^2 (DA^2). Experiments further verify the effectiveness of the halving and querying abilities of DA^2. Contributions of this paper are as follows.